High-Quality Prosody Generation in Mandarin Text-to-Speech System
Authors
Abstract
A text-to-speech (TTS) synthesizer is a computer-based system that automatically reads text aloud. Fujitsu is developing a Mandarin TTS system using state-of-the-art technologies. The prosodic structure of the text to be synthesized provides important information for making the synthetic speech more natural and understandable. This paper describes a global probability estimation method for predicting prosodic words, which are the lowest-level constituents of the prosodic structure. Experimental results are promising: the method outperforms our previous binary prosodic tree method in both accuracy and memory cost.
Similar resources
Modular Design for Mandarin Text-to-speech Synthesis
In the European Union funded project Technology and Corpora for Speech-to-Speech Translation (TC-STAR) [3], we have developed a modular concatenative TTS system for Mandarin Chinese. A common architecture has been introduced based on well-defined modules and interfaces. Three main modules, text processing, prosody processing and acoustic synthesis modules, are used following a commonly employed...
An Example-based Approach for Prosody Generation in Chinese Speech Synthesis
Prosody generation is an important issue in text-to-speech systems. In this paper we present an example-based approach to prosody generation in Mandarin Chinese speech synthesis. The general idea is to obtain the prosodic information from real speech examples. We first analyze the given Chinese text and form a linguistic feature vector, which describes the phonetic and lexicon char...
Unsupervised prosody labeling for constructing Mandarin TTS
This paper introduces an unsupervised prosody labeling method for preparing a large speech corpus used in developing a Mandarin Text-to-Speech system. Adopting a four-layer prosody hierarchy, the proposed method performs an unsupervised segmental clustering that iteratively segments spoken utterances into strings of prosodic constituents and models the patterns of the segmented prosodic constit...
On Cross-Dialect and -Speaker Adaptation of Speaking Rate-Dependent Hierarchical Prosodic Model for a Hakka Text-to-Speech System
This paper presents an effective adaptation of an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) for Mandarin to construct the SR-HPM for Hakka, another Chinese dialect. Based on the cross-dialectal linguistic similarities in terms of syntactic and prosodic structures, the adaptation is formulated as a maximum a posteriori (MAP) estimation problem with the existing Mandari...
Automatic Prosody Generation in a Text-to-speech System for Hebrew
The paper presents the module for automatic prosody generation within a system for automatic synthesis of high-quality speech based on arbitrary text in Hebrew. The high quality of synthesis is due to the high accuracy of automatic prosody generation, enabling the introduction of elements of natural sentence prosody of Hebrew. Automatic morphological annotation of text is based on the applicati...
Journal title:
Volume, issue:
Pages: -
Publication date: 2010